Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features
Abstract
A major source of performance degradation in speaker recognition systems is channel mismatch between training and testing. This paper improves the channel robustness of a speaker recognition system in two ways: a channel compensation technique and channel-robust features. The system is a text-independent speaker identification system based on two-stage recognition. For channel compensation, the paper applies the MAP (maximum a posteriori) channel compensation technique, previously used in speech recognition, to speaker recognition. For channel-robust features, it introduces pitch-dependent features and a pitch-dependent speaker model for the second recognition stage. After a first recognition stage scores the test speech with GMMs (Gaussian mixture models), the system uses the GMM scores to decide whether the speech needs to be recognized again. If so, it selects a few of the speakers from the first stage as candidates for the second stage. For each selected speaker, the system obtains three pitch-dependent results from that speaker's pitch-dependent model and uses an ANN (artificial neural network) to combine them with the speaker's GMM score into a fused result. The second-stage decision is based on these fused results. Experiments show that the correct rate of the two-stage recognition system with MAP channel compensation and pitch-dependent features is 41.7% better than that of the baseline system in a closed-set test.

Keywords: Channel Compensation, Channel Robustness, MAP, Speaker Identification
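The two-stage flow described in the abstract can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the GMM scores trigger the second stage, so the margin test between the top two scores, the `margin_threshold` and `n_candidates` parameters, and the function names `two_stage_identify`, `pitch_results` and `fuse` are all hypothetical. The `fuse` callable stands in for the trained ANN.

```python
from typing import Callable, Dict, Sequence


def two_stage_identify(
    gmm_scores: Dict[str, float],
    pitch_results: Callable[[str], Sequence[float]],
    fuse: Callable[[Sequence[float]], float],
    margin_threshold: float = 1.0,
    n_candidates: int = 3,
) -> str:
    """Return the identified speaker for one test utterance.

    gmm_scores    -- first-stage GMM score per enrolled speaker (higher is better)
    pitch_results -- hypothetical helper returning the 3 pitch-dependent results
                     for a speaker from that speaker's pitch-dependent model
    fuse          -- stand-in for the ANN: maps [3 pitch results, 1 GMM score]
                     to a single fused score
    """
    if not gmm_scores:
        raise ValueError("no enrolled speakers")

    # Stage 1: rank all speakers by GMM score.
    ranked = sorted(gmm_scores, key=gmm_scores.get, reverse=True)
    if len(ranked) == 1:
        return ranked[0]

    # Assumed decision rule (not given in the abstract): accept the first-stage
    # result when the best score clearly beats the runner-up.
    margin = gmm_scores[ranked[0]] - gmm_scores[ranked[1]]
    if margin >= margin_threshold:
        return ranked[0]

    # Stage 2: re-score only a few top candidates with the fused result.
    fused = {}
    for speaker in ranked[:n_candidates]:
        inputs = list(pitch_results(speaker)) + [gmm_scores[speaker]]
        fused[speaker] = fuse(inputs)
    return max(fused, key=fused.get)
```

With made-up scores such as `{"alice": -41.2, "bob": -41.5, "carol": -47.0}`, the first-stage margin of 0.3 is below the default threshold, so the decision falls to the fused scores of the candidates. A wider margin would return the first-stage winner directly.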
Related resources
Minimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution
In our previous work, we have proposed a speaker modeling technique using spectral and pitch features for text-independent speaker identification based on Multi-Space Probability Distribution Gaussian Mixture Models (MSD-GMMs). We have presented a maximum likelihood (ML) estimation procedure for the MSD-GMM parameters and demonstrated its high recognition performance. In this paper, we describe...
Speaker identification using Gaussian mixture models based on multi-space probability distribution
This paper presents a new approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM allows us to model continuous pitch values for voiced frames and discrete symbols representing unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled...
Factor analysis based channel compensation in speaker verification
This report describes a powerful channel compensation method for the text-independent speaker verification task. This powerful method is developed in the LRDE Speaker Verification framework. The purpose of a text-independent speaker verification system is to check whether a hypothesised speaker is really the author of a speech utterance. The channel compensation problem arises when training dat...
Telephone-based Text-dependent Speaker Verification
In this thesis, we investigate model selection and channel variability issues on telephone-based text-dependent speaker verification applications. Due to the lack of an appropriate database for the task, we collected two multi-channel speaker recognition databases which are referred to as text-dependent variable text (TDVT-D) and text-dependent ...
Text-independent Speaker Identification System Using Average Pitch and Formant Analysis
The aim of this paper is to design a closed-set text-independent Speaker Identification system using average pitch and speech features from formant analysis. The speech features represented by the speech signal are potentially characterized by formant analysis (Power Spectral Density). In this paper we have designed two methods: one for average pitch estimation based on Autocorrelation and othe...
Publication date: 2008